synthetic population
An Adaptive, Data-Integrated Agent-Based Modeling Framework for Explainable and Contestable Policy Design
Multi-agent systems often operate under feedback, adaptation, and non-stationarity, yet many simulation studies retain static decision rules and fixed control parameters. This paper introduces a general adaptive multi-agent learning framework that integrates: (i) four dynamic regimes distinguishing static versus adaptive agents and fixed versus adaptive system parameters; (ii) information-theoretic diagnostics (entropy rate, statistical complexity, and predictive information) to assess predictability and structure; (iii) structural causal models for explicit intervention semantics; (iv) procedures for generating agent-level priors from aggregate or sample data; and (v) unsupervised methods for identifying emergent behavioral regimes. The framework offers a domain-neutral architecture for analyzing how learning agents and adaptive controls jointly shape system trajectories, enabling systematic comparison of stability, performance, and interpretability across non-equilibrium, oscillatory, or drifting dynamics. Mathematical definitions, computational operators, and an experimental design template are provided, yielding a structured methodology for developing explainable and contestable multi-agent decision processes.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Greenland (0.04)
- Research Report (1.00)
- Overview (1.00)
- Health & Medicine (1.00)
- Government (1.00)
- Energy > Power Industry (1.00)
Population synthesis with geographic coordinates
Lenti, Jacopo, Costantini, Lorenzo, Fosch, Ariadna, Monticelli, Anna, Scala, David, Pangallo, Marco
It is increasingly important to generate synthetic populations with explicit coordinates rather than coarse geographic areas, yet no established methods exist to achieve this. One reason is that latitude and longitude differ from other continuous variables, exhibiting large empty spaces and highly uneven densities. To address this, we propose a population synthesis algorithm that first maps spatial coordinates into a more regular latent space using Normalizing Flows (NF), and then combines them with other features in a Variational Autoencoder (VAE) to generate synthetic populations. This approach also learns the joint distribution between spatial and non-spatial features, exploiting spatial autocorrelations. We demonstrate the method by generating synthetic homes with the same statistical properties as real homes in 121 datasets, corresponding to diverse geographies. We further propose an evaluation framework that measures both spatial accuracy and practical utility, while ensuring privacy preservation. Our results show that the NF+VAE architecture outperforms popular benchmarks, including copula-based methods and uniform allocation within geographic areas. The ability to generate geolocated synthetic populations at fine spatial resolution opens the door to applications requiring detailed geography, from household responses to floods, to epidemic spread, evacuation planning, and transport modeling.
- Europe > Italy > Piedmont > Turin Province > Turin (0.05)
- North America > United States > District of Columbia > Washington (0.04)
- Europe > Spain > Aragón > Zaragoza Province > Zaragoza (0.04)
- Europe > Italy > Lazio > Rome (0.04)
- Information Technology > Security & Privacy (1.00)
- Banking & Finance (1.00)
- Health & Medicine (0.68)
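The uniform-allocation benchmark named in the abstract above can be illustrated with a minimal sketch. It assumes each geographic area is approximated by a rectangular bounding box and that a household count per area is already known; the area names, coordinates, counts, and the helper `allocate_uniform` are hypothetical and are not taken from the paper.

```python
import random

def allocate_uniform(bbox, n, seed=0):
    """Draw n points uniformly inside a rectangular area.

    bbox is (min_lon, min_lat, max_lon, max_lat). A rectangle stands in for
    a real administrative polygon, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        (rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))
        for _ in range(n)
    ]

# Hypothetical areas with household counts (illustrative values only).
areas = {
    "area_a": {"bbox": (7.60, 45.00, 7.70, 45.10), "households": 3},
    "area_b": {"bbox": (7.70, 45.00, 7.80, 45.10), "households": 2},
}

# One list of (lon, lat) pairs per area; no within-area structure is modeled.
synthetic_homes = {
    name: allocate_uniform(info["bbox"], info["households"], seed=i)
    for i, (name, info) in enumerate(areas.items())
}
```

Because every location inside an area is equally likely, a baseline of this kind ignores within-area density variation, which is the gap the abstract says the NF+VAE approach is meant to close.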
MICROTRIPS: MICRO-geography TRavel Intelligence and Pattern Synthesis
Wang, Yangyang, Fabusuyi, Tayo
This study presents a novel small-area estimation framework to enhance urban transportation planning through detailed characterization of travel behavior. Our approach improves on the four-step travel model by employing publicly available microdata files and machine learning methods to predict travel behavior for a representative, synthetic population at small geographic areas. This approach enables high-resolution estimation of trip generation, trip distribution, mode choice, and route assignment. Validation using ACS/PUMS work-commute datasets demonstrates that our framework achieves higher accuracy compared to conventional approaches. The resulting granular insights enable the tailoring of interventions to address localized situations and support a range of policy applications and targeted interventions, including the optimal placement of micro-fulfillment centers, effective curb-space management, and the design of more inclusive transportation solutions particularly for vulnerable communities.
- Pacific Ocean > North Pacific Ocean > Puget Sound (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > New York (0.04)
- Government > Regional Government > North America Government > United States Government (0.95)
- Transportation > Infrastructure & Services (0.94)
- Transportation > Ground > Road (0.93)
Target Population Synthesis using CT-GAN
Rastogi, Tanay, Jonsson, Daniel
Agent-based models used in scenario planning for transportation and urban planning usually require detailed population information from the base as well as target scenarios. These populations are usually provided by synthesizing fake agents through deterministic population synthesis methods. However, these deterministic population synthesis methods face several challenges, such as handling high-dimensional data, scalability, and zero-cell issues, particularly when generating populations for target scenarios. This research looks into how a deep generative model called Conditional Tabular Generative Adversarial Network (CT-GAN) can be used to create target populations either directly from a collection of marginal constraints or through a hybrid method that combines CT-GAN with Fitness-based Synthesis Combinatorial Optimization (FBS-CO). The research evaluates the proposed population synthesis models against travel survey and zonal-level aggregated population data. Results indicate that the stand-alone CT-GAN model performs the best when compared with FBS-CO and the hybrid model. CT-GAN by itself can create realistic-looking groups that match single-variable distributions, but it struggles to maintain relationships between multiple variables. However, the hybrid model demonstrates improved performance compared to FBS-CO by leveraging CT-GAN's ability to generate a descriptive base population, which is then refined using FBS-CO to align with target-year marginals. This study demonstrates that CT-GAN represents an effective methodology for target populations and highlights how deep generative models can be successfully integrated with conventional synthesis techniques to enhance their performance.
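The zero-cell issue mentioned in the abstract above can be made concrete with a small sketch. Iterative proportional fitting (IPF) is used here only as a familiar example of a deterministic synthesis method; the abstract does not name it, and the seed table and target marginals below are invented for illustration.

```python
def ipf(seed, row_targets, col_targets, iterations=50):
    """Iterative proportional fitting on a 2-D contingency table."""
    table = [row[:] for row in seed]
    for _ in range(iterations):
        # Scale each row so that it sums to its target marginal.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Scale each column so that it sums to its target marginal.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
    return table

# Illustrative seed with one empty cell at position (1, 0).
seed = [[10.0, 5.0], [0.0, 20.0]]
row_targets = [40.0, 60.0]
col_targets = [30.0, 70.0]

fitted = ipf(seed, row_targets, col_targets)
```

The fitted table matches both sets of marginals, but the cell that was zero in the seed stays zero because IPF only rescales existing counts. A combination that is absent from the sample can therefore never appear in the synthetic population, whatever the target marginals say.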
Population Synthesis using Incomplete Information
Rastogi, Tanay, Jonsson, Daniel, Karlström, Anders
This paper presents a population synthesis model that utilizes the Wasserstein Generative-Adversarial Network (WGAN) for training on incomplete microsamples. By using a mask matrix to represent missing values, the study proposes a WGAN training algorithm that lets the model learn from a training dataset that has some missing information. The proposed method aims to address the challenge of missing information in microsamples on one or more attributes due to privacy concerns or data collection constraints. The paper contrasts WGAN models trained on incomplete microsamples with those trained on complete microsamples, creating a synthetic population. We conducted a series of evaluations of the proposed method using a Swedish national travel survey. We validate the efficacy of the proposed method by generating synthetic populations from all the models and comparing them to the actual population dataset. The results from the experiments showed that the proposed methodology successfully generates synthetic data that closely resembles both the output of a model trained with complete data and the actual population. The paper contributes to the field by providing a robust solution for population synthesis with incomplete data, opening avenues for future research, and highlighting the potential of deep generative models in advancing population synthesis capabilities.
- Health & Medicine (0.46)
- Information Technology > Security & Privacy (0.34)
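The mask-matrix idea in the abstract above (Population Synthesis using Incomplete Information) can be sketched in a few lines. This is only an assumption about how such a mask might be used, not the paper's training algorithm: missing entries are marked with 0, and the same mask is applied to real and generated rows so that a critic would compare them on observed attributes only. The helper names and sample values are hypothetical.

```python
def build_mask(rows):
    """Return a 0/1 mask: 1 where a value is observed, 0 where it is None."""
    return [[0 if v is None else 1 for v in row] for row in rows]

def apply_mask(rows, mask, fill=0.0):
    """Replace entries with `fill` wherever the mask is 0."""
    return [
        [v if m == 1 else fill for v, m in zip(row, mask_row)]
        for row, mask_row in zip(rows, mask)
    ]

# Illustrative microsample with missing attributes encoded as None.
real = [[0.2, None, 1.0], [None, 0.5, 0.0]]
# Illustrative generator output of the same shape (always complete).
generated = [[0.3, 0.7, 0.9], [0.1, 0.4, 0.2]]

mask = build_mask(real)
real_masked = apply_mask(real, mask)
generated_masked = apply_mask(generated, mask)
```

Masking both inputs identically means the generator is never penalized for values in positions where the real record carries no information.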
Cultural Bias in Large Language Models: Evaluating AI Agents through Moral Questionnaires
Are AI systems truly representing human values, or merely averaging across them? Our study suggests a concerning reality: Large Language Models (LLMs) fail to represent diverse cultural moral frameworks despite their linguistic capabilities. We expose significant gaps between AI-generated and human moral intuitions by applying the Moral Foundations Questionnaire across 19 cultural contexts. Comparing multiple state-of-the-art LLMs' origins against human baseline data, we find these models systematically homogenize moral diversity. Surprisingly, increased model size doesn't consistently improve cultural representation fidelity. Our findings challenge the growing use of LLMs as synthetic populations in social science research and highlight a fundamental limitation in current AI alignment approaches. Without data-driven alignment beyond prompting, these systems cannot capture the nuanced, culturally-specific moral intuitions. Our results call for more grounded alignment objectives and evaluation metrics to ensure AI systems represent diverse human values rather than flattening the moral landscape.
- Europe > Belgium (0.04)
- South America > Peru (0.04)
- South America > Colombia (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.88)
A Large Language Model for Feasible and Diverse Population Synthesis
Lim, Sung Yoo, Yun, Hyunsoo, Bansal, Prateek, Kim, Dong-Kyu, Kim, Eui-Jin
Generating a synthetic population that is both feasible and diverse is crucial for ensuring the validity of downstream activity schedule simulation in activity-based models (ABMs). While deep generative models (DGMs), such as variational autoencoders and generative adversarial networks, have been applied to this task, they often struggle to balance the inclusion of rare but plausible combinations (i.e., sampling zeros) with the exclusion of implausible ones (i.e., structural zeros). To improve feasibility while maintaining diversity, we propose a fine-tuning method for large language models (LLMs) that explicitly controls the autoregressive generation process through topological orderings derived from a Bayesian Network (BN). Experimental results show that our hybrid LLM-BN approach outperforms both traditional DGMs and proprietary LLMs (e.g., ChatGPT-4o) with few-shot learning. Specifically, our approach achieves approximately 95% feasibility, significantly higher than the ~80% observed in DGMs, while maintaining comparable diversity, making it well-suited for practical applications. Importantly, the method is based on a lightweight open-source LLM, enabling fine-tuning and inference on standard personal computing environments. This makes the approach cost-effective and scalable for large-scale applications, such as synthesizing populations in megacities, without relying on expensive infrastructure. By initiating the ABM pipeline with high-quality synthetic populations, our method improves overall simulation reliability and reduces downstream error propagation. The source code for these methods is available for research and practical application.
- Asia > South Korea > Seoul > Seoul (0.05)
- Asia > Singapore > Central Region > Singapore (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
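The role of a BN-derived topological ordering in the abstract above can be illustrated with a short sketch. It assumes a small, made-up dependency structure and a plain "attribute is value" text template; neither the attributes nor the template come from the paper. The point is only that each attribute is written after its parents, so an autoregressive model would see the conditioning variables first.

```python
from graphlib import TopologicalSorter

# Hypothetical Bayesian-network structure: each attribute maps to its parents.
parents = {
    "age": set(),
    "employment": {"age"},
    "income": {"age", "employment"},
    "car_ownership": {"income"},
}

# static_order() yields every node after all of its predecessors (parents).
order = list(TopologicalSorter(parents).static_order())

def serialize(record, order):
    """Write a record as text with attributes in parent-before-child order."""
    return ", ".join(f"{name} is {record[name]}" for name in order)

# Illustrative record; dictionary key order is deliberately scrambled.
person = {
    "car_ownership": "yes",
    "income": "high",
    "age": "35-44",
    "employment": "full-time",
}
text = serialize(person, order)
```

With this structure the ordering is unique, so every training record would be serialized with the same attribute sequence regardless of how the raw data is stored.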
A Generic Modelling Framework for Last-Mile Delivery Systems
Gürcan, Önder, Szczepanska, Timo, Falck, Vanja, Antosz, Patrycja, Cebeci, Merve Seher, de Bok, Michiel, Tapia, Rodrigo, Tavasszy, Lóránt
Large-scale social digital twinning projects are complex with multiple objectives. For example, a social digital twinning platform for innovative last-mile delivery solutions may aim to assess consumer delivery method choices within their social environment. However, no single tool can achieve all objectives. Different simulators exist for consumer behavior and freight transport. Therefore, we propose a high-level architecture and present a blueprint for a generic modelling framework. This includes defining modules, input/output data, and interconnections, while addressing data suitability and compatibility risks. We demonstrate the framework's effectiveness with two real-world case studies.
- Europe > Netherlands > South Holland (0.15)
- Europe > Poland (0.14)
- Europe > Norway (0.14)
- Europe > Italy (0.14)
- Transportation > Infrastructure & Services (0.50)
- Transportation > Freight & Logistics Services (0.47)
- Health & Medicine > Therapeutic Area (0.31)
Generating Spatial Synthetic Populations Using Wasserstein Generative Adversarial Network: A Case Study with EU-SILC Data for Helsinki and Thessaloniki
Using agent-based social simulations can enhance our understanding of urban planning, public health, and economic forecasting. Realistic synthetic populations with numerous attributes strengthen these simulations. The Wasserstein Generative Adversarial Network, trained on census data like EU-SILC, can create robust synthetic populations. These methods, aided by external statistics or EU-SILC weights, generate spatial synthetic populations for agent-based models. The increased access to high-quality micro-data has sparked interest in synthetic populations, which preserve demographic profiles and analytical strength while ensuring privacy and preventing discrimination. This study uses national data from Finland and Greece for Helsinki and Thessaloniki to explore balanced spatial synthetic population generation. Results show challenges related to balancing data with or without aggregated statistics for the target population and the general under-representation of fringe profiles by deep generative methods. The latter can lead to discrimination in agent-based simulations.
- Europe > Finland > Uusimaa > Helsinki (0.67)
- Europe > Greece > Central Macedonia > Thessaloniki (0.65)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Norway > Southern Norway > Agder > Kristiansand (0.04)
Agent-Based Modelling of Older Adult Needs for Autonomous Mobility-on-Demand: A Case Study in Winnipeg, Canada
As populations continue to age across many nations, ensuring accessible and efficient transportation options for older adults has become an increasingly important concern. Autonomous Mobility-on-Demand (AMoD) systems have emerged as a potential solution to address the needs faced by older adults in their daily mobility. However, estimation of older adult mobility needs, and how they vary over space and time, is crucial for effective planning and implementation of such services, and conventional four-step approaches lack the granularity to fully account for these needs. To address this challenge, we propose an agent-based model of older adults' mobility demand in Winnipeg, Canada. The model is built for 2022 using primarily open data and is implemented in the Multi-Agent Transport Simulation (MATSim) toolkit. After calibration to accurately reproduce observed travel behaviors, a new AMoD service is tested in simulation and its potential adoption among Winnipeg's older adults is explored. The model can help policy makers estimate the needs of elderly populations for door-to-door transportation and can guide the design of AMoD transport systems.
- North America > Canada > Manitoba > Winnipeg Metropolitan Region > Winnipeg (0.94)
- Europe > France > Île-de-France (0.04)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Transportation > Passenger (1.00)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)